ABSTRACT

Numerous transcription factors self-assemble into 
different order oligomeric species in a way that is 
actively regulated by the cell. Until now, no general 
functional role has been identified for this widespread
process. Here, we capture the effects of
modulated self-assembly in gene expression with 
a novel quantitative framework. We show that this 
mechanism provides precision and flexibility, two 
seemingly antagonistic properties, to the sensing 
of diverse cellular signals by systems that share 
common elements present in transcription factors 
like p53, NF-κB, STATs, Oct and RXR. Applied to
the nuclear hormone receptor RXR, this framework 
accurately reproduces a broad range of classical, 
previously unexplained, sets of gene expression 
data and corroborates the existence of a precise 
functional regime with flexible properties that can 
be controlled both at a genome-wide scale and 
at the individual promoter level. 

INTRODUCTION 

A recurrent theme in gene regulation is the self-assembly 
of transcription factors (TF) into coexisting populations 
of dimers, tetramers and other higher order oligomers that 
can bind simultaneously single and multiple DNA sites. 
This behavior has been observed explicitly in the tumor 
suppressor p53 (1), the nuclear factor κB (NF-κB) (2,3),
the signal transducers and activators of transcription 
(STATs) (4), the octamer-binding proteins (Oct) (5,6), 
and the retinoid nuclear hormone receptor (7) (Table 1). 
In these systems, the properties of self-assembly, and 
the partitioning into low and high order oligomeric 
species, are strongly regulated and modulated by several 
types of signals, such as ligand binding (8), protein 
binding (9,10), acetylation (11) and phosphorylation
(6,12). The general implications of this modulation,
however, are not clear. 

At the level of single DNA sites, it is well established 
that the effects of TFs are finely determined by their concentration
and cognate DNA sequences (13). Processes
based on interactions with different molecules and 
post-transcriptional modifications are assumed to affect 
mainly the DNA binding properties of the TFs or their 
ability to recruit coregulators. This idea is entrenched in 
the field of gene regulation and is systematically used as a 
guiding principle in the ongoing development of molecular 
therapies against diverse diseases (14). But TFs rarely act 
through just a single binding site (6,15-21) (Table 1). 
Modulated self-assembly (MSA) provides a key mechanism
for controlling the ability of TFs to bind two or more
DNA sites simultaneously. 

To determine the common wide-ranging effects of 
MSA, we have developed a general quantitative frame- 
work that accurately links MSA with control of gene 
expression (Figure 1). It focuses on the general aspects 
of the core control mechanism shared by the wide 
variety of regulatory systems where MSA is present, 
which include TF self-assembly and its modulation, 
binding of the TF oligomers to DNA, and the resulting 
transcriptional responses. This quantitative framework 
allowed us to uncover modulation of the oligomeric 
states of TFs as a flexible mechanism for precise sensing 
of molecular signals in the presence of intracellular 
fluctuations. Precision ensures that the transcriptional 
response is consistently triggered at a given modulator 
signal strength irrespective of the TF concentration. 
Flexibility allows the precise triggering point to be 
changed, up to several orders of magnitude, both at the 
individual promoter level by changing its DNA sequence 
and at a genome-wide scale by changing the molecular 
self-assembly properties. 

This methodology identified a core set of features 
needed to implement control of transcription by MSA 
that are present in a wide variety of structurally different 



*To whom correspondence should be addressed. Tel: +1 530 752 6700; Fax: +1 530 754 5739; Email: lsaiz@ucdavis.edu 
© The Author(s) 2011. Published by Oxford University Press. 

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/ 
by-nc/3.0), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited. 



Nucleic Acids Research, 2011, Vol. 39, No. 16 6855 



Table 1. MSA of TFs

TF (a)   Self-assembly modulation                  Oligomerization states (b)                           DNA binding
-------  ----------------------------------------  ---------------------------------------------------  -------------------------------------------------------------
RXR      Ligand binding (8)                        Monomer, dimer*, tetramer* (7)                       1 site, 4 consecutive half-sites (15), 2 separated sites (16)
p53      Protein binding (9,10), acetylation (11)  Monomer, dimer, tetramer*, stacked tetramers* (1)    1 site, 2 separated half-sites, 2 separated sites (17,18)
NF-κB    Protein-mediated                          Dimer*, tetramer* (2,3)                              2 separated sites (19)
STAT     Phosphorylation (12)                      Dimer*, tetramer* (4)                                Tandem sites (20,21)
Oct      Phosphorylation (6)                       Monomer*, dimer* (5), tetramer* (6)                  1 site, 2 separated sites (6)

(a) For each TF, the table shows the experimentally observed mechanism of the self-assembly modulation process, the oligomerization states involved,
and the corresponding arrangement of DNA binding sites at the promoter.

(b) The symbol * indicates the oligomeric species that have been observed to substantially bind DNA.
Figure 1. Quantitative modeling of control of gene expression by modulated self-assembly. Intracellular signals are processed through 'Modulated
self-assembly' into populations of different oligomeric species that upon 'DNA binding' engage in 'Transcriptional control'. Modulated self-assembly:
the intensity of a self-assembly modulator signal [s], e.g. ligand or active kinase concentration, regulates the formation of high order oligomers by
modifying (represented as a yellow spark) the low order oligomers and preventing their self-assembly into the high order species. DNA binding: the
oligomeric species bound to DNA (in orange/red) are described by their free energies with the statistical weights ($Z_{state}$) shown for each binding state
(expression in black). The parenthesized number, in blue, labels each of the 17 states and the molecular representations illustrate the binding
combinations of the transcriptional regulator to the two DNA sites (site 1 and site 2). The top left box summarizes the notation. Transcriptional
control: one state (state 2) can trigger response R1, in which an enhancer is positioned in the vicinity of the promoter region, and twelve states (states
6-17) can potentially trigger response R2, in which a coactivator is recruited to the promoter region. Dimers and tetramers have been drawn as
compositions of the nuclear hormone receptor RXR structures from the PDB files 1BY4 (DNA binding domains bound to the two half-sites on
DNA, or RXR response elements) and 1G1U (ligand binding domains).



systems (Table 1). As an exemplar of these systems, we 
have considered explicitly the nuclear hormone receptor 
RXR. In this case, the quantitative framework accurately 
reproduced, in some instances even without free
parameters, a broad range of classical, previously unexplained
gene expression experimental data and
demonstrated how flexible precise control of gene expression
can be achieved directly at the molecular level
through modulation of the oligomerization state of transcriptional
regulators.



MATERIALS AND METHODS 

The first step in the signaling cascade orchestrated by 
MSA is the regulation of the relative abundance of 
the oligomerization states of the TF (Figure 1). The 
self-assembly modulator, such as a ligand that binds to 
a TF or a kinase that phosphorylates the TF, affects 
the low-order oligomers to promote or prevent their 
self-assembly. We consider explicitly tetramers, $n_4$,
dimers, $n_2$, and non-tetramerizing dimers, $n_2^*$, as relevant
high and low order oligomeric species. Other oligomerization
pairs, such as octamer-tetramers or
dimer-monomers, are mathematically equivalent to
tetramer-dimers.

We quantitate the effects of self-assembly modulation
through the modulator function $f([s]) = [n_2^*]/[n_2]$, which
describes, in terms of concentrations, the partitioning
into the tetramerizing and non-tetramerizing dimers by
the self-assembly modulator, s. This process affects
dimer and tetramer concentrations, which are related
to each other through $[n_2]^2/[n_4] = K_{td}$, where $K_{td}$ is the
tetramer-dimer dissociation constant.

The precise form of the modulator function is given by
the specific mode of action of the modulator. An explicit
example is $f([s]) = [s]/K_{lig}$ for a ligand s that upon binding
to the dimer $n_2$, with dissociation constant $K_{lig}$, renders it
unable to tetramerize in the form $n_2^*$. Another relevant,
mechanistically different situation corresponds to
$f([s]) = k_{dephos}/([s] v_{phos})$ for phosphorylation in the linear
regime of the non-tetramerizing, $n_2^*$, into the tetramerizing,
$n_2$, dimer species. In this case, [s] is the concentration of
active kinases and $v_{phos}$ and $k_{dephos}$ are the phosphorylation
and dephosphorylation rate constants, respectively.
In general, several mechanisms can be involved at the
same time in controlling the oligomerization properties.
For instance, the case in which the two previous processes
are combined so that the dimer has to be both free of
ligand and phosphorylated to be able to tetramerize
leads to a two-variable modulator function given by

$f([s_l],[s_p]) = [s_l]/K_{lig} + [s_l] k_{dephos}/(K_{lig} [s_p] v_{phos}) + k_{dephos}/([s_p] v_{phos})$,

where $[s_l]$ is the ligand concentration and $[s_p]$
is the concentration of active kinases.
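The three modulator functions above can be written down directly. The following is a minimal sketch, assuming arbitrary but consistent units; the function names and the numerical values in the demo are hypothetical and chosen only for illustration. It also makes the origin of the combined expression visible: if $a = [s_l]/K_{lig}$ and $b = k_{dephos}/([s_p] v_{phos})$, the combined function equals $(1+a)(1+b) - 1$, since only dimers that are both ligand-free and phosphorylated count as tetramerizing.

```python
def f_ligand(s, k_lig):
    """Ligand-bound dimers cannot tetramerize: f = [s] / K_lig."""
    return s / k_lig


def f_phosphorylation(s, v_phos, k_dephos):
    """Only phosphorylated dimers tetramerize (linear regime): f = k_dephos / ([s] v_phos)."""
    return k_dephos / (s * v_phos)


def f_combined(s_l, s_p, k_lig, v_phos, k_dephos):
    """Dimer must be ligand-free AND phosphorylated to tetramerize."""
    a = f_ligand(s_l, k_lig)
    b = f_phosphorylation(s_p, v_phos, k_dephos)
    # Sum of the three terms of the two-variable modulator function.
    return a + a * b + b


if __name__ == "__main__":
    # Hypothetical values (not taken from the paper).
    print(f_combined(s_l=20.0, s_p=5.0, k_lig=8.0, v_phos=0.1, k_dephos=1.0))
```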

Binding of the different TF oligomers to the DNA 
sites mediates the transcriptional effects of the self- 
assembly modulator (Figure 1). Typically, tetramers and 
both types of dimers bind single DNA sites in a very 
similar way, with free energies $\Delta G^{\circ}_{s1}$ and $\Delta G^{\circ}_{s2}$ for sites 1
and 2, respectively. These quantities are related to
the corresponding dissociation constants through
$\Delta G^{\circ}_{s1} = RT\ln(K_{s1})$ and $\Delta G^{\circ}_{s2} = RT\ln(K_{s2})$. Tetramers, in
addition, can bind two sites simultaneously because they
have two DNA-binding domains, one from each of their two
constituent dimers, which contribute $\Delta G^{\circ}_{s1}$ and
$\Delta G^{\circ}_{s2}$ to the free energy. The simultaneous binding of
two domains is typically accompanied by conformational
changes, e.g. twisting and bending, in both the tetramer
and DNA (22,23), which contribute an additional
conformational term, $\Delta G^{\circ}_{c}$, to the free energy. Therefore,
the standard free energy of the state with the tetramer
bound to two sites is given by $\Delta G^{\circ}_{s1}+\Delta G^{\circ}_{s2}+\Delta G^{\circ}_{c}$. This
conformational contribution has been studied in detail 
in the case of DNA looping by prokaryotic TFs and is 
dependent, among others, on the TF and DNA flexibility, 
the relative position of the DNA-binding sites, and the 
DNA supercoiling state (22,23). 

We use statistical thermodynamics to quantitatively 
describe binding to DNA in terms of free energies and 
concentrations of the different oligomeric species (24-26). 
The key quantity is the statistical weight, or Boltzmann
factor, defined as $Z_i = [n_4]^{t_i}[n_2]^{d_i}[n_2^*]^{m_i} e^{-\Delta G^{\circ}_i/RT}$, which
relates the relative probability of the binding state i with
its standard free energy $\Delta G^{\circ}_i$. The exponents $t_i$, $d_i$ and $m_i$
correspond to the number of tetramers, dimers and
non-tetramerizing dimers in the state i, respectively. The
factor RT is the gas constant, R, times the absolute
temperature, T. The probability of a given group of
binding states c, $P_c = \sum_{i \in c} Z_i / \sum_i Z_i$, is obtained by
adding the statistical weights of its states and normalizing
by the sum for all the possible states.

For a system with two binding sites, there are 
17 binding states (Figure 1). These states are those with 
both sites empty; one occupied by a dimer or a tetramer; 
two sites occupied by two dimers, by two tetramers or 
a dimer and tetramer; and two sites occupied simultan- 
eously by a single tetramer. In the case of states with 
dimers, one has to take into account that a dimer can 
either be in the form that allows or prevents tetra- 
merization. In general, each binding state includes a 
constellation of molecular substates with different DNA 
conformations. For instance, the state with both sites 
empty can include a bent DNA conformation, as in the 
case when the two sites are occupied simultaneously by 
a single tetramer, but the lack of a tetramer to stabilize 
the conformation makes this conformation highly 
unlikely. This type of effect has been described in detail
for other TFs that bind two DNA sites simultaneously, 
such as the lac repressor (27). 
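The counting of the 17 states and the group probabilities $P_c$ can be sketched as follows. This is an illustrative enumeration, not the authors' code: it assumes concentrations in molar units (standard state of 1 M), a value RT = 0.6 kcal/mol, and hypothetical names (`binding_states`, `group_probability`). Each site can be empty or hold $n_4$, $n_2$ or $n_2^*$ (16 combinations), and one extra state has a single tetramer bound to both sites.

```python
import math
from itertools import product

RT = 0.6  # kcal/mol; assumed value


def binding_states(n4, n2, n2s, dg1, dg2, dgc):
    """Return (site1, site2, statistical weight) for the 17 binding states."""
    conc = {"empty": 1.0, "n4": n4, "n2": n2, "n2*": n2s}
    states = []
    # Independent occupancy of the two sites: 4 x 4 = 16 states.
    for s1, s2 in product(conc, repeat=2):
        dg = (dg1 if s1 != "empty" else 0.0) + (dg2 if s2 != "empty" else 0.0)
        states.append((s1, s2, conc[s1] * conc[s2] * math.exp(-dg / RT)))
    # One tetramer bound to both sites, paying the conformational term.
    states.append(("loop", "loop", n4 * math.exp(-(dg1 + dg2 + dgc) / RT)))
    return states


def group_probability(states, predicate):
    """Sum the weights of the states selected by predicate, normalized by all states."""
    total = sum(w for _, _, w in states)
    return sum(w for s1, s2, w in states if predicate(s1, s2)) / total


def is_dimer(s):
    return s in ("n2", "n2*")


if __name__ == "__main__":
    # Hypothetical concentrations (M) and free energies (kcal/mol).
    st = binding_states(n4=2.0e-7, n2=3.0e-8, n2s=1.0e-7, dg1=-12.0, dg2=-11.0, dgc=9.47)
    print(len(st), group_probability(st, lambda a, b: a == "loop"))
```

Grouping by predicate keeps the enumeration separate from the choice of which states drive a given response, so the same state list serves both the tetramer-bridged state and the dimer-occupied groups.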

There is also the possibility that oligomerization is so 
weak in solution that it is only observed on DNA. This 
effect can be put in quantitative terms with our framework 
by considering that the state with the tetramer bound 
simultaneously to two DNA sites (Figure 1) can also be 
described as two interacting dimers that bind cooperatively
to DNA. The statistical weight of this state is given by
$Z_2 = [n_4] e^{-(\Delta G^{\circ}_{s1}+\Delta G^{\circ}_{s2}+\Delta G^{\circ}_{c})/RT}$ in terms of tetramer
concentration and by $Z_2 = ([n_2]^2/K_{td}) e^{-(\Delta G^{\circ}_{s1}+\Delta G^{\circ}_{s2}+\Delta G^{\circ}_{c})/RT}$ in
terms of dimer concentration, which can be rewritten
as $Z_2 = [n_2]^2 e^{-(\Delta G^{\circ}_{s1}+\Delta G^{\circ}_{s2}+\Delta G_{int})/RT}$ with
$\Delta G_{int} = \Delta G^{\circ}_{c} + RT\ln K_{td}$. Thus, a very high dissociation constant
that does not lead to significant tetramerization in 
solution is sufficient to promote tetramerization on 
DNA when the conformational free energy is sufficiently 
low. Intuitively, tetramerization is observed on DNA 
because binding to DNA brings the tetramerization 
domains close to each other and increases their local 
concentration. 
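As a quick numerical check of this rewriting, the sketch below evaluates the weight of the two-site state in both forms. It assumes RT = 0.6 kcal/mol and concentrations in molar units; the function names and the numbers in the demo are hypothetical.

```python
import math

RT = 0.6  # kcal/mol; assumed value


def dg_int(dgc, k_td):
    """Effective dimer-dimer interaction free energy on DNA: dG_c + RT ln(K_td)."""
    return dgc + RT * math.log(k_td)


def z2_from_tetramer(n4, dg1, dg2, dgc):
    """Weight of the two-site state written with the tetramer concentration."""
    return n4 * math.exp(-(dg1 + dg2 + dgc) / RT)


def z2_from_dimers(n2, dg1, dg2, dgc, k_td):
    """Same weight written as two cooperatively bound dimers."""
    return n2 ** 2 * math.exp(-(dg1 + dg2 + dg_int(dgc, k_td)) / RT)


if __name__ == "__main__":
    n2, k_td = 3.0e-8, 4.4e-9          # M
    n4 = n2 ** 2 / k_td                # tetramer-dimer equilibrium
    print(z2_from_tetramer(n4, -12.0, -11.0, 9.47))
    print(z2_from_dimers(n2, -12.0, -11.0, 9.47, k_td))
```

Both calls give the same weight, which is the content of the identity: the tetramer-dimer equilibrium is absorbed into an interaction term between DNA-bound dimers.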
Table 2. Probability, $P_c$, of the different groups of binding states

In the full expressions, D denotes the common denominator
$D = e^{-\Delta G^{\circ}_{c}/RT}[n_4] + (e^{\Delta G^{\circ}_{s1}/RT}+[n_4]+[n_2]+[n_2^*])(e^{\Delta G^{\circ}_{s2}/RT}+[n_4]+[n_2]+[n_2^*])$.

P_c      States (a)        Full expression (b)                                          Simplified expression (c)
-------  ----------------  -----------------------------------------------------------  ------------------------------------------------------------
$P_t$    2                 $e^{-\Delta G^{\circ}_{c}/RT}[n_4] / D$                       $1 / (1 + ([n_2]+[n_2^*])^2 / (e^{-\Delta G^{\circ}_{c}/RT}[n_4]))$
$P_{od}$ 6, 7, 15, 16      $([n_2]+[n_2^*])(e^{\Delta G^{\circ}_{s1}/RT}+[n_4]) / D$     0
$P_{do}$ 9, 10, 12, 13     $([n_2]+[n_2^*])(e^{\Delta G^{\circ}_{s2}/RT}+[n_4]) / D$     0
$P_{dd}$ 8, 11, 14, 17     $([n_2]+[n_2^*])^2 / D$                                       $1 / (1 + e^{-\Delta G^{\circ}_{c}/RT}[n_4] / ([n_2]+[n_2^*])^2)$

(a) States involved in the group as described in Figure 1.

(b) The expressions for the probabilities follow from the statistical thermodynamic approach with the free energies of each state as described in
Figure 1.

(c) Simplified expressions for the probabilities in the functional regime.



Two differentiated types of transcriptional responses 
can be constructed from the binding states of the TF on 
DNA (Figure 1). 

The first type, referred to as response R1, involves
a high-order oligomer that simultaneously binds two
non-adjacent DNA sites. Upon binding, the high-order
oligomer loops out the intervening DNA and positions
a distal enhancer in the vicinity of the promoter region
to control transcription. The probability $P_t$ of the state
with the tetramer bound to the two DNA sites simultaneously
(Table 2) determines the effective transcription
rate through the expression $\Gamma_{R1} = \Gamma_{ref}(1 - P_t) + \Gamma_t P_t$,
which weights the transcription rates that the system has
with, $\Gamma_t$, and without, $\Gamma_{ref}$, the distal enhancer close to
the promoter.

The second type, denoted here response R2, takes 
advantage of the differentiated recruitment abilities 
of different oligomerization states. This mode of regula- 
tion applies to a coactivator that is recruited by a 
low-order oligomer by binding to a molecular surface 
that is occluded in the high-order oligomer. In this 
case, the effective transcription rate is given by

$\Gamma_{R2} = \Gamma_{ref}(1 - P_{do} - P_{od} - P_{dd}) + \Gamma_{do}P_{do} + \Gamma_{od}P_{od} + \Gamma_{dd}P_{dd}$.

The subscripts do, od and dd of the transcription rates
$\Gamma$ and probabilities P refer to the group of states with
dimers bound to just site 1, to just site 2 and to both
sites, respectively (Table 2). $\Gamma_{ref}$ is the transcription rate
with no dimers bound, including empty sites and sites
occupied by tetramers.

Responses R1 and R2 embrace the prototypical cases
mediated by long and short range interactions between 
regulatory elements. They are controlled by the relative 
occupancy of DNA binding sites by the different 
oligomeric species. This mode of functioning differs 
from other systems with multiple binding sites, like the 
lac operon, which are controlled by the absolute occu- 
pancy of their sites by a single oligomeric species (28). 
For instance, IPTG, an inducer of the lac operon, does 
not affect the oligomerization state of the tetrameric lac 



repressor but prevents each of its two DNA binding 
domains from significantly binding their cognate sites 
(28,29). 

RESULTS 

To uncover the unique characteristics that emerge from
the core structure of MSA in such a wide variety
of structurally different systems (Table 1), we focus on a
functional regime that guarantees a response to
changes in the self-assembly modulator concentration.
This regime has two properties. The first is
that the TF concentration is sufficiently high for it to
significantly bind DNA. In mathematical terms, this implies
$[n_4]+[n_2]+[n_2^*] \gg e^{\Delta G^{\circ}_{s1}/RT}$ and $[n_4]+[n_2]+[n_2^*] \gg e^{\Delta G^{\circ}_{s2}/RT}$.
The second is that the tetramer concentration is sufficiently
low, $[n_4] \ll [n_2]+[n_2^*]$, so that tetramers do not take
over the binding completely. The reason is that for typical
values of $\Delta G^{\circ}_{c}$, tetramers bind more strongly to two DNA
sites simultaneously than dimers do to a single DNA site
(27,30,31).

The key implication of this regime is that the
probabilities of the different groups of binding states
simplify in such a way (see Table 2) that the transcriptional
responses are governed by the reduced expressions

$\Gamma_{R1} = \Gamma_{ref}(1 - P_t) + \Gamma_t P_t$,
$\Gamma_{R2} \approx \Gamma_{ref} P_t + \Gamma_{dd}(1 - P_t)$,   (1)

with

$P_t \approx \frac{1}{1 + (1 + f([s]))^2 e^{\Delta G^{\circ}_{c}/RT} K_{td}}$,   (2)

which show that responses R1 and R2, despite being
mechanistically different, follow the same control logic.
In both cases, the two-site binding of the tetramer,
quantified by $P_t$, determines the contributions of the
reference and activated transcriptional states. The end
result is even more remarkable because the particular
form of $P_t$ imparts precision and flexibility to the transcriptional
responses, two properties that are the cornerstone
of natural gene expression systems but that have
proved to be highly elusive because of their seemingly
antagonistic character (13).
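The cancellation of the TF concentration in Equation (2) can be checked numerically. The following is a minimal sketch, assuming RT = 0.6 kcal/mol and molar concentrations; all function names and the transcription-rate values are hypothetical. It compares the simplified $P_t$ of Table 2, evaluated at explicit dimer concentrations, with the reduced form that depends only on $f([s])$, $\Delta G^{\circ}_{c}$ and $K_{td}$.

```python
import math

RT = 0.6  # kcal/mol; assumed value


def p_t_reduced(f, dgc, k_td):
    """Equation (2): probability of the tetramer bridging both sites."""
    return 1.0 / (1.0 + (1.0 + f) ** 2 * math.exp(dgc / RT) * k_td)


def p_t_from_concentrations(n2, f, dgc, k_td):
    """Simplified Table 2 expression evaluated at explicit concentrations."""
    n2s = f * n2                 # non-tetramerizing dimers, from f = [n2*]/[n2]
    n4 = n2 ** 2 / k_td          # tetramer-dimer equilibrium
    z_t = math.exp(-dgc / RT) * n4
    return z_t / (z_t + (n2 + n2s) ** 2)


def rate_r1(p_t, g_ref, g_t):
    """Response R1: enhancer brought close to the promoter by the tetramer."""
    return g_ref * (1.0 - p_t) + g_t * p_t


def rate_r2(p_t, g_ref, g_dd):
    """Response R2 in the functional regime: coactivator recruited by dimers."""
    return g_ref * p_t + g_dd * (1.0 - p_t)


if __name__ == "__main__":
    for n2 in (1e-9, 1e-7, 1e-5):  # three very different dimer concentrations (M)
        print(n2, p_t_from_concentrations(n2, f=3.0, dgc=9.47, k_td=4.4e-9))
    print(p_t_reduced(f=3.0, dgc=9.47, k_td=4.4e-9))
```

The printed probabilities agree for every dimer concentration, which is the sense in which the response is triggered at a signal strength that does not depend on how much TF is present.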

Precision ensures that the transcriptional response is 
consistently triggered at a given modulator signal 
strength irrespective of the particular TF concentration, 
which cancels out in the reduced equations that govern the 
system behavior. Flexibility, on the other hand, allows the
precise triggering point to be altered, up to several orders
of magnitude, both at the individual promoter level by
changing its organization ($\Delta G^{\circ}_{c}$ depends on the distance
between the two DNA binding sites (17,22)) and at a
genome-wide scale by changing the molecular
self-assembly properties ($f([s])$ and $K_{td}$ affect the regulation
of all genes in the same way).

All these results can be observed explicitly in the 
retinoid X receptor (RXR), an exemplar of the essential 
regulators that share the central features of MSA 
(Table 1). RXR controls a large number of genes by 
binding to DNA as homodimer, homotetramer or obliga- 
tory heterodimerization partner for other nuclear recep- 
tors. Nuclear retinoid receptors are highly significant 
because they mediate the pleiotropic effects of retinoic 
acid, which include cell proliferation, differentiation and 
embryonic development and affect the carcinogenic 
process in a number of organs (32). 

The canonical self-assembly modulator of RXR is the
hormone 9-cis-retinoic acid (9cRA), a derivative of
Vitamin A, which binds each RXR subunit independently
of its oligomerization state (33) and prevents dimers with
their two subunits occupied from tetramerizing (8). This
behavior is consistent with $n_2$ being an apo-dimer and with
$n_2^*$ being a holo-dimer, as observed in the respective crystal
structures of the dimers with no ligand bound (34) and
with two ligands bound (35). The crystal structure of one
tetramer with two ligands bound (36) shows that two
dimers with just one ligand each can form tetramers
with a structure similar to those of two apo-dimers. In
addition to 9cRA, there are other ligands of RXR, such as
oleic acid, docosahexaenoic acid,
methoprene acid and phytanic acid (35).

These early steps in sensing 9cRA and other ligand concentrations
are taken into account by the explicit form of
the modulator function, which we obtain from the mass
action law as

$f([s]) = \frac{[n_2^*]}{[n_2]} = \frac{[s]^2}{K_{lig}^2 + 2K_{lig}[s]}$,   (3)

where $K_{lig}$ and [s] are the ligand-RXR dissociation
constant and the ligand concentration, respectively (see
Supplementary Data).
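Equation (3) can be recovered from subunit occupancies. The sketch below assumes, as stated above, that the ligand binds each subunit independently and that only dimers with both subunits occupied are unable to tetramerize; the function names are hypothetical. With single-subunit occupancy $p = [s]/(K_{lig}+[s])$, the holo-dimer fraction is $p^2$ and the remaining fraction $1 - p^2$ can still tetramerize.

```python
import math


def f_rxr(s, k_lig):
    """Equation (3): closed-form modulator function for RXR."""
    return s ** 2 / (k_lig ** 2 + 2.0 * k_lig * s)


def f_rxr_from_occupancy(s, k_lig):
    """Same quantity built from independent ligand binding to the two subunits."""
    p = s / (k_lig + s)      # probability that one subunit holds a ligand
    holo = p ** 2            # both subunits occupied: cannot tetramerize
    return holo / (1.0 - holo)


if __name__ == "__main__":
    k_lig = 8.0  # nM, value quoted for 9cRA in Table 3
    for s in (1.0, 8.0, 100.0):
        print(s, f_rxr(s, k_lig), f_rxr_from_occupancy(s, k_lig))
```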

To compare with the experimental data, we normalize
the fold induction, a measure of relative changes in transcriptional
activity, so that its variation ranges from 0 to 1.
This quantity, referred to as normalized fold induction
(NFI), is defined explicitly as $NFI = (FI - 1)/(FI_{max} - 1)$,
where $FI_{max}$ is the maximum value of the fold induction
FI. In terms of the NFI, the results do not depend on
parameters related to the baseline and maximum expression
levels and it becomes possible to effectively compare
experiments on different promoters and cell lines (see
Supplementary Data). The only parameters needed to
characterize the shape of the response in the functional
regime are $K_{lig}$ and $K_{td}$, which have been measured experimentally,
and $\Delta G^{\circ}_{c}$, which can be inferred by adjusting its
value to reproduce the experimental data.

This approach accurately describes the experimental 
observations (16) for the ligand 9cRA and a promoter 
with two non-adjacent DNA binding sites for RXR and 
a distal enhancer (Figure 2A). Simultaneous binding of an 
RXR tetramer to the two sites loops out the intervening 
DNA and brings the enhancer close to the promoter 
region (response R1). Increasing the concentration of
9cRA prevents the formation of RXR tetramers and 
leads to deactivation of transcription. 

The very same approach also captures in detail the 
observed behavior when the two DNA binding sites are 
next to each other, as in the classic set of experiments that 
uncovered 9cRA as the cognate ligand of RXR, for dif- 
ferent promoters and cell lines (Figure 2B). In these cases, 
only the dimeric forms of RXR with ligand bound can 
recruit a coactivator (response R2) and increasing the 
concentration of 9cRA results in the activation of 
transcription. The extent of activation is modulated 
by the RXR AF-1 domain and RXR phosphorylation 
(37-39). 

This framework has the much-sought ability to fully 
predict, without free parameters, the responses to different 
ligands from the values of $\Delta G^{\circ}_{c}$ obtained in response to
just a single ligand. Applied to the all-trans retinoic acid
(atRA), which was tested early on as a potential cognate
ligand of RXR (40,41), the approach closely recapitulates
its effects on transcription for different cell types and promoters
from the values of $\Delta G^{\circ}_{c}$ inferred in the responses to
9cRA (Figure 2C). This ability to fully predict responses 
without free parameters is especially important because it 
provides a direct avenue to transfer specific molecular 
information of the ligand-TF interaction, as described 
by the measured or computed parameters, across scales 
up to the transcriptional effects. 

The high variability of the transcriptional responses, as 
observed in Figure 2, has been a long-standing recurrent 
issue in RXR gene regulation. In particular, the 
half-maximum response point, characterized by the
EC50, ranges from just above the RXR-ligand dissociation
constant up to values 30-fold higher (Table 3).
Our results have identified MSA as a potential mechanism
to control the EC50 at the single-gene level through the
value of $\Delta G^{\circ}_{c}$ (Table 3). This promoter-dependent flexibility
indicates that for these systems, the observed variability
is not a random aspect of the experimental setup but
the result of RXR precisely tailoring the response to each 
individual gene. 
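The dependence of the EC50 on $\Delta G^{\circ}_{c}$ can be illustrated by solving NFI = 0.5 numerically. The following is a sketch under stated assumptions, not the authors' procedure: it takes RT = 0.6 kcal/mol (not stated in the text), $K_{td}$ = 4.4 nM, the NFI expressions quoted in the Figure 2 caption, and a simple bisection on the logarithm of the ligand concentration; the function names are hypothetical. Under these assumptions the computed values land close to the corresponding entries of Table 3.

```python
import math

RT = 0.6       # kcal/mol; assumed value
K_TD = 4.4e-9  # M, tetramer-dimer dissociation constant


def f_rxr(s, k_lig):
    """Equation (3)."""
    return s ** 2 / (k_lig ** 2 + 2.0 * k_lig * s)


def p_t(s, k_lig, dgc):
    """Equation (2) with the RXR modulator function."""
    return 1.0 / (1.0 + (1.0 + f_rxr(s, k_lig)) ** 2 * math.exp(dgc / RT) * K_TD)


def nfi_r1(s, k_lig, dgc):
    """Response R1: NFI equals P_t (decreases with ligand)."""
    return p_t(s, k_lig, dgc)


def nfi_r2(s, k_lig, dgc):
    """Response R2: ligand-bound dimers on DNA (increases with ligand)."""
    return (1.0 - (1.0 + s / k_lig) ** -4) * (1.0 - p_t(s, k_lig, dgc))


def ec50(nfi, k_lig, dgc, lo=1e-12, hi=1e-2):
    """Bisection in log-concentration for NFI = 0.5; assumes one crossing in [lo, hi]."""
    def g(s):
        return nfi(s, k_lig, dgc) - 0.5
    lo_positive = g(lo) > 0
    for _ in range(200):
        mid = math.sqrt(lo * hi)
        if (g(mid) > 0) == lo_positive:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)


if __name__ == "__main__":
    # (response, ligand K_lig in M, dG_c in kcal/mol), following rows of Table 3.
    for nfi, k_lig, dgc in ((nfi_r1, 8e-9, 8.03), (nfi_r2, 8e-9, 9.47), (nfi_r2, 350e-9, 9.47)):
        print(nfi.__name__, k_lig, dgc, ec50(nfi, k_lig, dgc) * 1e9, "nM")
```

Bisection on a logarithmic grid is used because the responses span several decades of ligand concentration; any bracketing root finder would serve equally well.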

The observed variability can be collapsed in the form of 
response landscapes (Figure 3), which represent the tran- 
scriptional activity as a function of the conformational 
free energy in addition to just the usual ligand concentra- 
tion of dose-response curves. The landscapes explicitly 
show the ability of RXR to shape the molecular 
[Figure 2 plots: panels "Response R1" and "Response R2"; horizontal axes [9cRA] (nM) and [atRA] (nM), logarithmic scale.]

Figure 2. Prediction of RXR-mediated transcriptional responses to
9cRA and atRA ligands. The results of the model (lines) for the
functional regime are compared to the normalized fold induction
(NFI) from experimental data (symbols) for different promoters and
ligands. The model uses the experimental values $K_{lig} = 8$ nM for
9cRA (52) or $K_{lig} = 350$ nM for atRA (53), and $K_{td} = 4.4$ nM (7).
The conformational free energy $\Delta G^{\circ}_{c}$ (shown in kcal/mol) is inferred
from just the experimental data for 9cRA by minimizing the mean
squared error between model and experiments and the resulting value
is used subsequently for responses to atRA. (A) Response to 9cRA for
a system with two separated DNA binding sites for RXR and a distal
enhancer. Experimental gene expression data is taken from Figure 5b
of Yasmin et al. (16), which used COS-7 cells transfected with the
reporter, consisting of double RXRE and a UAS site 300-bp
upstream, in a vector encoding GAL4-VP16. The NFI was computed
as $NFI_{R1} = P_t$ (see Supplementary Data) with Equations (2) and (3).
In this case, the half-maximum response concentration, or EC50, is
about 35 times higher than the 9cRA-RXR dissociation constant.
(B) Responses to 9cRA for systems with contiguous DNA binding
sites for RXR. The variability of the dose-response curves, including
10-fold changes in the EC50 and different slopes, is accurately captured
by the model by just adjusting $\Delta G^{\circ}_{c}$. The three different curves correspond
to three different experimental systems, reported in Figure 5a of
Heyman et al. (40) (top), which used S2 cells cotransfected with
the expression plasmid A5C-hRXRα and the reporter plasmid
ADH-CRBPII-LUC; Figure 4b of Levin et al. (41) (center), which
used CV-1 cells cotransfected with the reporter CRBPII-RXRE-CAT
construct and plasmid RXRα; and Figure 5b of Heyman et al. (40)
(bottom), which used CV-1 cells cotransfected with the expression
plasmid pRSh-RXRα and the reporter plasmid TK-CRBPII-LUC.
The NFI was computed as $NFI_{R2} = (1 - (1+[s]/K_{lig})^{-4})(1 - P_t)$ (see
Supplementary Data) with Equations (2) and (3). (C) Responses to
atRA for systems with contiguous DNA binding sites for RXR. The
three different dose-response curves correspond to the three systems of
Figure 2B with the all-trans retinoic acid (atRA) as ligand of RXR
instead of 9cRA. The highly variable dose-response curves are fully
predicted without free parameters using the values of $\Delta G^{\circ}_{c}$ inferred in
Figure 2B.

Table 3. EC50 control by ligand binding strength and conformational
free energy

EC50 (nM)   Ligand   K_lig (nM)   ΔG°_c (kcal/mol)
---------   ------   ----------   ----------------
287.4       9cRA     8            8.03
77.8        9cRA     8            9.47
18.3        9cRA     8            10.76
14.3        9cRA     8            10.92
3403.2      atRA     350          9.47
798.7       atRA     350          10.76
626.0       atRA     350          10.92

The EC50 is defined as the ligand concentration that gives the half-maximum
response.


response to ligand binding in a promoter-dependent way.
The response landscapes show how the EC50 increases as
the conformational free energy decreases in a way that
closely matches the experimental observations (Figure 3).

To investigate the extent to which typical experimental
conditions fall within the functional regime
(which, as previously described, is characterized by
$[n_4]+[n_2]+[n_2^*] \gg e^{\Delta G^{\circ}_{s1}/RT}$, $[n_4]+[n_2]+[n_2^*] \gg e^{\Delta G^{\circ}_{s2}/RT}$ and
$[n_4] \ll [n_2]+[n_2^*]$), we considered the model for RXR in
the whole-parameter space. All groups of binding states
were considered explicitly without simplifications of the 
expressions for the corresponding probabilities (Table 2). 
In addition to the relevant quantities of the functional 
regime, the whole-parameter space includes the experi- 
mentally measured free energies of binding to DNA, the 
RXR dimer-monomer dissociation constant and the 
nuclear RXR concentration. The results (Figure 4) are 
virtually independent of the precise value of the total 
nuclear RXR protein concentration over, at least, a 
10-fold range and accurately capture the diverse dose-re- 
sponse curves observed in the experiments, in agreement 
with the results for the functional regime. In all cases, the 
ranges of concentrations include 550 nM, the estimated 
RXR nuclear concentration in HL-60 cells (7). 
Therefore, the ability to elicit flexible and precise 
responses, as uncovered in the general analysis, is also 
present when the particularities of RXR-mediated tran- 
scriptional responses are taken into account. 



DISCUSSION 

Cellular processes rely on intricate molecular mechanisms 
to function in extraordinarily diverse intra- and extra- 
cellular environments. Eukaryotic gene expression, in
particular, has been shown to be exceedingly complex (42-44).
Just the core of the transcriptional machinery itself 
involves a wide variety of components with oscillatory 
patterns of macromolecular assembly and phosphoryl- 
ation (45). On top of the constitutive processes, there are 
many other molecular interactions that provide regula- 
tion, enhancing or reducing gene expression and adjusting 
to changing cellular conditions (46). To understand how 
these different levels of molecular complexity contribute to 
the observed behavior, one needs the right approaches 
(47,48). 






[Figure 3 plots: panels (A) "Response R1", (B) "Response R2" and (C) "Response R2".]

Figure 3. Local and global flexibility in the response landscapes. The normalized fold induction (NFI) for RXR from the model, computed as in
Figure 2, is shown as a function of both the self-assembly modulator intensity (either 9cRA or atRA concentration) and the conformational free
energy $\Delta G^{\circ}_{c}$ (in kcal/mol). The figures on the bottom are density-plot projections of the corresponding NFI on the top, with dark and light gray
corresponding to low and high values of the NFI, respectively. The red line corresponds to NFI = 0.5 and shows the dependence of the
half-maximum response concentration, or EC50, on the conformational free energy. The EC50 can be changed locally, at the single-promoter
level, by changing the value of $\Delta G^{\circ}_{c}$, or globally, at a genome-wide scale, by changing $K_{lig}$, the strength of the ligand binding to RXR. The 3D plots
show the same experimental data (symbols) as in (A) Figure 2A, (B) Figure 2B, and (C) Figure 2C along with the dose-response curves (black lines)
for the corresponding values of the conformational free energy. The values of the parameters used are $K_{td} = 4.4$ nM (7) and either $K_{lig} = 8$ nM for
9cRA (52) or $K_{lig} = 350$ nM for atRA (53).



The quantitative framework we have developed 
provides an efficient avenue to connect the molecular 
properties of MSA with its effects in the control of gene 
expression. This framework allowed us to uncover unique 
properties of control of gene expression by MSA that lead 
to a flexible mechanism for precise sensing of diverse types 
of self-assembly modulation signals, irrespective of 
changes in TF concentration. Application of this method- 
ology to the nuclear hormone receptor RXR accurately 
describes the experimentally observed transcriptional 
responses for both enhancers (response R1) and
coactivators (response R2) from just the molecular 
properties of the components (Figures 2A and 2B), and 
successfully predicts the observed behavior without free 
parameters (Figure 2C). A detailed analysis of the 
whole-parameter space reveals that regulation by RXR 
is functioning in a precise regime, with minimal depend- 
ence on RXR nuclear concentration (Figure 4), in which 
the responses are highly diverse as a result of the inherent 
flexibility that accompanies precision in the control of 
gene expression by MSA (Figure 3). 

The observed TF-concentration insensitivity of control 
of gene expression by MSA contrasts with the traditional 
role of RXR as obligatory heterodimerization partner for 
other nuclear receptors, which relies on the absolute 
occupancy of the cognate binding sites by the heterodimer. 



In the case of RXRα:PPARγ regulation of adipogenesis,
however, it has been observed that several promoters are
controlled rather by the relative occupancy between
RXRα:PPARγ heterodimers and other RXRα heterodimers
or homo-oligomers (49). Our framework provides
a starting point to consider these more complex situations 
by coupling MSA with hetero-oligomerization and to 
combine these extensions with recent bioinformatics 
methods (50,51) to make accurate predictions on gene 
expression based on the binding profiles observed in the 
experimental data (49). 

The combined presence of flexibility and precision in the 
control of gene expression by MSA, as explicitly shown 
for RXR, allows a single TF to simultaneously regulate 
multiple genes with promoter-tailored dose-response 
curves that consistently maintain their diverse shapes for 
a broad range of the TF concentration changes. These 
features are especially important because essential TFs 
like p53, NF-κB, STATs, Oct and RXR, each of which
has all the core elements that form the backbone of
control of gene expression by MSA, regulate multiple 
genes that engage in processes as diverse as cancer, inflam- 
mation, autoimmune diseases and cellular differentiation. 
These results indicate that the prospects for devising more 
effective molecular therapies for systems controlled by 
MSA will greatly benefit from shifting potential 
intervention points from those that affect absolute concentrations
and single-site binding to those that can tackle
concentration ratios and promoter properties. 